Explora: Tackling Corpus Analysis with a Distributed Architecture
Author
Abstract
When analysing a corpus of software, researchers often ask questions that entail exploration and navigation, such as “what packages contain fat interfaces in open-source systems?”, “how consistently is the code being commented?” and “are naming conventions being followed?”. The answers to these questions can affect software maintainability and evolution. Software visualisation can help in understanding and exploring the answers to such questions, but corpus visualisations are time-consuming and difficult to achieve because they require large amounts of data to be processed. We tackle this constraint by using a distributed architecture. In this paper we propose an environment where researchers can build queries for their questions and then rapidly visualise the results. We elaborate on a proof-of-concept tool named Explora and report early results from visualising the Qualitas Corpus [4]. This paper uses colours in the figures. Please read a coloured printout of this paper for a better understanding.
Similar resources
Explora: Infrastructure for Scaling Up Software Visualisation to Corpora
Visualisation provides good support for software analysis. It copes with the intangible nature of software by providing concrete representations of it. By reducing the complexity of software, visualisations are especially useful when dealing with large amounts of code. One domain that usually deals with large amounts of source code data is empirical analysis. Although there are many tools for a...
EXPLoRA-web: linkage analysis of quantitative trait loci using bulk segregant analysis
Identification of genomic regions associated with a phenotype of interest is a fundamental step toward solving questions in biology and improving industrial research. Bulk segregant analysis (BSA) combined with high-throughput sequencing is a technique to efficiently identify these genomic regions associated with a trait of interest. However, distinguishing true from spuriously linked genomic r...
A Conversation Analysis of Ellipsis and Substitution in Global Business English Textbooks
Despite the body of research on textbook evaluation from the discourse analysis perspective, cohesive devices have rarely been analyzed in English for Specific Purposes (ESP) textbooks. The acquisition and use of cohesive devices are inherent to naturalistic communication, including business interactions. Hence, L2 learners of business English should be exposed to these devices through cohesion-...
Adequacy of the Endometrial Samples Obtained by the Uterine Explora Device and Conventional Dilatation and Curettage: A Comparative Study
Aims. Our aim is to compare the adequacy and diagnostic yield of samples obtained by the endometrial Explora Sampler I-MX120 with endometrial specimens obtained by conventional dilatation and curettage (D&C). Methods. A total of 1270 endometrial samples were received in the histopathology laboratories at the King Khalid University Hospital, Riyadh, Saudi Arabia, between 2007 and 2010. In the ou...
The EUDICO Project, Multi Media Annotation over the Internet
In this paper we describe a software environment that facilitates media annotation and analysis of media-related corpora over the internet. We will describe the general architecture of this environment and we will introduce our Abstract Corpus Model, with which we isolate corpus-specific formats from the annotation and analysis tools. The main set of tools is described by giving examples of the...